Uncertainty decoding for DNN-HMM hybrid systems based on numerical sampling
Abstract
In this article, we propose an uncertainty decoding scheme for DNN-HMM hybrid systems based on numerical sampling. A finite set of samples is drawn from the estimated probability distribution of the acoustic features and subsequently passed through feature transformations/extensions and the deep neural network (DNN). Then, the nonlinearly transformed feature samples are averaged at the output of the DNN in order to approximate the posterior distribution of the context-dependent Hidden Markov Model (HMM) states. This concept is experimentally verified for the REVERB challenge task using a reverberation-robust DNN-HMM hybrid system: the numerical sampling is performed in the log-mel spectral (logmelspec) domain, where we estimate the posterior distribution of the acoustic features by combining coherence-based Wiener filtering and uncertainty propagation. The experimental results show that the proposed uncertainty decoding scheme significantly increases recognition accuracy, even for a small number of feature samples.
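The sampling-and-averaging step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the feature posterior is taken to be a diagonal Gaussian with a given mean and variance per frame, the feature transformations/extensions are omitted, and `toy_dnn` is a hypothetical two-layer network with random weights standing in for the trained acoustic model.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def toy_dnn(x, W1, b1, W2, b2):
    # Hypothetical stand-in for the trained DNN: maps feature vectors
    # of shape (N, D) to HMM-state posteriors of shape (N, S).
    h = np.tanh(x @ W1 + b1)
    return softmax(h @ W2 + b2)

def sampled_state_posteriors(mean, var, dnn, num_samples, rng):
    # Assumption: the estimated feature distribution for one frame is a
    # diagonal Gaussian with the given mean and variance, both of shape (D,).
    noise = rng.standard_normal((num_samples, mean.shape[0]))
    samples = mean + np.sqrt(var) * noise      # (num_samples, D)
    # Pass every sample through the DNN, then average at the output.
    return dnn(samples).mean(axis=0)           # (S,)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, H, S = 6, 16, 4                         # feature dim, hidden units, states
    W1, b1 = rng.standard_normal((D, H)), np.zeros(H)
    W2, b2 = rng.standard_normal((H, S)), np.zeros(S)
    dnn = lambda x: toy_dnn(x, W1, b1, W2, b2)

    mean = rng.standard_normal(D)              # enhanced feature estimate
    var = np.full(D, 0.3)                      # its estimated uncertainty
    print(sampled_state_posteriors(mean, var, dnn, num_samples=10, rng=rng))
```

With zero variance, every sample equals the mean, so the average reduces to the ordinary point-estimate DNN output; a larger variance spreads the samples and smooths the averaged state posteriors.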
Similar resources
An improved uncertainty decoding scheme with weighted samples for DNN-HMM hybrid systems
In this paper, we advance a recently proposed uncertainty decoding scheme for DNN-HMM (deep neural network hidden Markov model) hybrid systems. This numerical sampling concept averages DNN outputs produced by a finite set of feature samples (drawn from a probabilistic distortion model) to approximate the posterior likelihoods of the context-dependent HMM states. As the main innovation, we propose a...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves the performance of these systems. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Uncertainty Decoding with Adaptive Sampling for Noise Robust DNN-Based Acoustic Modeling
Although deep neural network (DNN) based acoustic models have obtained remarkable results, automatic speech recognition (ASR) performance still remains low in noisy and reverberant conditions. To address this issue, a speech enhancement front-end is often used before recognition to reduce noise. However, the front-end cannot fully suppress noise and often introduces artifacts that are limit...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Deep segmental neural networks for speech recognition
Hybrid systems that integrate the deep neural network (DNN) and hidden Markov model (HMM) have recently achieved remarkable performance in many large vocabulary speech recognition tasks. These systems, however, still rely on the HMM and assume that the acoustic scores for the (windowed) frames are independent given the state, suffering from the same difficulty as the previous GMM-HMM systems...